Labeling data for classification requires significant human effort. To reduce labeling cost, instead of labeling every instance, a group of instances (a bag) is labeled by a single bag label. Computer algorithms are then used to infer the label for each instance in a bag, a process referred to as instance annotation. This task is challenging due to the ambiguity regarding the instance labels. We propose a discriminative probabilistic model for the instance annotation problem and introduce an expectation maximization framework for inference, based on the maximum likelihood approach. For many probabilistic approaches, brute-force computation of the instance label posterior probability given its bag label is exponential in the number of instances in the bag. Our key contribution is a dynamic programming method for computing the posterior that is linear in the number of instances. We evaluate our methods using both benchmark and real-world data sets, in the domains of bird song, image annotation, and activity recognition. In many cases, the proposed framework outperforms, sometimes significantly, the current state-of-the-art MIML learning methods, both in instance label prediction and bag label prediction.
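The contrast between the exponential and the linear computation can be illustrated with a minimal sketch. The abstract does not spell out the model, so the code below rests on assumptions that are ours, not the paper's: the bag label is taken to be the set union of the instance labels, instance labels are independent given per-instance class probabilities (`priors`), and every probability is positive. The function names `posterior_brute_force` and `posterior_dynamic_programming` are hypothetical, and the forward-backward recursion over label subsets is one possible way to reach linear cost in the number of instances, not necessarily the authors' algorithm.

```python
from itertools import product


def _normalize(row):
    """Scale a {label: weight} dict so that the weights sum to one."""
    total = sum(row.values())
    if total == 0.0:
        raise ValueError("bag label is impossible under the given priors")
    return {label: weight / total for label, weight in row.items()}


def posterior_brute_force(priors, bag_labels):
    """Enumerate all |Y|^n label assignments (exponential in n)."""
    n = len(priors)
    target = set(bag_labels)
    joint = [{c: 0.0 for c in bag_labels} for _ in range(n)]
    for assignment in product(bag_labels, repeat=n):
        # Assumed bag model: the bag label is the union of instance labels.
        if set(assignment) != target:
            continue
        weight = 1.0
        for i, c in enumerate(assignment):
            weight *= priors[i][c]
        for i, c in enumerate(assignment):
            joint[i][c] += weight
    return [_normalize(row) for row in joint]


def posterior_dynamic_programming(priors, bag_labels):
    """Forward-backward pass over label subsets (linear in n)."""
    n = len(priors)
    labels = list(bag_labels)
    bit = {c: 1 << k for k, c in enumerate(labels)}
    full = (1 << len(labels)) - 1
    n_masks = full + 1

    # forward[i][mask]: probability that instances 0..i-1 cover exactly `mask`.
    forward = [[0.0] * n_masks for _ in range(n + 1)]
    forward[0][0] = 1.0
    for i in range(n):
        for mask in range(n_masks):
            if forward[i][mask] == 0.0:
                continue
            for c in labels:
                forward[i + 1][mask | bit[c]] += forward[i][mask] * priors[i][c]

    # backward[i][mask]: probability that instances i..n-1 cover exactly `mask`.
    backward = [[0.0] * n_masks for _ in range(n + 1)]
    backward[n][0] = 1.0
    for i in range(n - 1, -1, -1):
        for mask in range(n_masks):
            if backward[i + 1][mask] == 0.0:
                continue
            for c in labels:
                backward[i][mask | bit[c]] += backward[i + 1][mask] * priors[i][c]

    posteriors = []
    for i in range(n):
        joint = {}
        for c in labels:
            total = 0.0
            # Instances before and after i, together with label c,
            # must cover the full bag label set.
            for left in range(n_masks):
                for right in range(n_masks):
                    if (left | right | bit[c]) == full:
                        total += forward[i][left] * backward[i + 1][right]
            joint[c] = priors[i][c] * total
        posteriors.append(_normalize(joint))
    return posteriors


if __name__ == "__main__":
    # Made-up class probabilities for a bag of four instances.
    example_priors = [
        {"a": 0.7, "b": 0.3},
        {"a": 0.2, "b": 0.8},
        {"a": 0.5, "b": 0.5},
        {"a": 0.9, "b": 0.1},
    ]
    for row in posterior_dynamic_programming(example_priors, ["a", "b"]):
        print(row)
```

In this sketch the dynamic program visits each instance a constant number of times, so its cost grows linearly with the bag size, while the enumeration grows as |Y|^n. The cost of the sketch is still exponential in the number of labels in the bag label set, since it tracks every subset of those labels.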